1 Setup

Let’s load the pidiq2 package.

#Installing Dependencies:
#PIDIQ2 requires the magick and EBImage packages:

#1) Magick:
#If using Linux, you will first have to install the Magick++ system libraries:
#Open a terminal and run the following command: sudo apt-get install libmagick++-dev

#Next, install the R-magick library:
install.packages('magick')


#2) EBImage:
#Use the remotes package to install_github():
install.packages('remotes')
remotes::install_github('aoles/EBImage')


####
#Install PIDIQ2 from https://github.com/cedatorma/pidiq2:
remotes::install_github("cedatorma/pidiq2")


#And lastly, load the PIDIQ2 package:
library(pidiq2)

Get paths to images from the ./photos/ directory:

img_dir <- './photos/'
img_paths <- paste0(img_dir, list.files(img_dir))

img_paths
##  [1] "./photos/DSC_0788.JPG"             "./photos/DSC_0788.JPG_cropped.jpg"
##  [3] "./photos/DSC_0789.JPG"             "./photos/DSC_0789.JPG_cropped.jpg"
##  [5] "./photos/DSC_0790.JPG"             "./photos/DSC_0790.JPG_cropped.jpg"
##  [7] "./photos/DSC_0791.JPG"             "./photos/DSC_0791.JPG_cropped.jpg"
##  [9] "./photos/DSC_0792.JPG"             "./photos/DSC_0792.JPG_cropped.jpg"
## [11] "./photos/DSC_0793.JPG"             "./photos/DSC_0793.JPG_cropped.jpg"
## [13] "./photos/DSC_0794.JPG"             "./photos/DSC_0794.JPG_cropped.jpg"
## [15] "./photos/DSC_0795.JPG"             "./photos/DSC_0795.JPG_cropped.jpg"
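
As a minimal sketch of an alternative way to build these paths, list.files() can return full paths directly, and the original photos can be separated from their cropped counterparts by file name. The example below runs against a temporary directory filled with empty placeholder files (hypothetical, created only so the code runs anywhere) rather than the real ./photos/ directory; the names demo_dir, orig_paths and cropped_paths are illustrative.

```r
# Create a throw-away directory with placeholder files mimicking ./photos/
demo_dir <- file.path(tempdir(), 'photos_demo')
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c('DSC_0788.JPG', 'DSC_0788.JPG_cropped.jpg',
                'DSC_0789.JPG', 'DSC_0789.JPG_cropped.jpg')
invisible(file.create(file.path(demo_dir, demo_files)))

# full.names = TRUE returns complete paths, so no paste0() is needed
all_paths <- list.files(demo_dir, full.names = TRUE)

# Separate the original photos from their cropped counterparts
is_cropped <- grepl('_cropped\\.jpg$', all_paths)
orig_paths <- all_paths[!is_cropped]
cropped_paths <- all_paths[is_cropped]

basename(orig_paths)
basename(cropped_paths)
```

Keeping the two sets apart avoids accidentally feeding an uncropped photo into the analysis later on.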

Let’s choose the first image to work with: ./photos/DSC_0788.JPG


2 Multi-Plant Image Test-Run

The image above is a flat containing 4 x 10 mature A. thaliana plants. Fortunately, PIDIQv2 has been designed to automatically process these images - but to do so it first requires that the image only contains the flat/plate region and that plants are seeded in equally spaced intervals.

As this image contains background that is not part of the flat, we have to perform a preliminary cropping step to remove it. You could use your favourite image editor to do this, but I’ve provided some tools for streamlining image pre-processing into the PIDIQv2 R analysis, which are located in the ./guis/ directory.

2.1 Image Pre-Processing - The PIDIQv2 Plate Cropper GUI

You can use ./guis/plate_cropper/plate_crop_launcher.R, a GUI written in gWidgets2 using the RGtk2 package (note that this requires installation of the necessary packages and their dependencies, described below):

#It can be launched from the command line using Rscript:
system(paste('Rscript', './guis/plate_cropper/plate_crop_launcher.R'))
  • In the future, this function will be implemented as a Shiny app, which will avoid the tricky installation of dependencies.

But, as a teaser, I will show you how plate_crop_launcher.R works. Integrating image pre-processing features directly into the PIDIQv2 R workflow keeps analysis streamlined and efficient.

1) Starting the Plate Cropper GUI

2) Selecting an Image to Crop

3) Loading Image

4) Cropping


The new cropped image is now saved in ./photos/DSC_0788.JPG_cropped.jpg.

2.1.1 GUI Installation - Dependencies

If you are interested in using plate_crop_launcher.R in its current state, you will need to install the following R packages:

  • remotes - For installation of non-CRAN packages, i.e. from github repos.

  • magick - A versatile image manipulation library. (Requires the libmagick++-dev library to be installed first).

  • gWidgets2 - A high-level API for generating the GUIs using calls to underlying, lower-level GUI generation libraries. You will also need to install GTK2 packages, in the following order:

    • RGtk2 - the R interface to the GTK+ library. (Requires Gtk2+ version 2.8.0: gtk2.0 and the libgtk2.0-dev libraries to be installed first).

    • cairoDevice - required by RGtk2 to display R-graphics plots.

    • gWidgets2RGtk2 - the toolkit implementation of gWidgets2 for RGtk2.

  • Notes:

    • See (https://gist.github.com/sebkopf/9405675) for details on how to troubleshoot the installation of RGtk2 dependencies for Windows and Mac.

    • If you are using Linux, I recommend installing the required back-end dependencies for the magick and RGtk2 R packages with the sudo apt-get install libmagick++-dev gtk2.0 build-essential libgtk2.0-dev command in the bash shell.

#GUI Dependencies Package Installation:

#1) remotes:
install.packages('remotes', repos = 'http://cran.us.r-project.org')

#2) magick:
remotes::install_github('ropensci/magick', repos = 'https://cloud.r-project.org')

#3) gWidgets2:
remotes::install_github('jverzani/gWidgets2')

#4) RGtk2 (after installing the Gtk2+ libraries, see above):
remotes::install_github('lawremi/RGtk2/RGtk2')

#5) cairoDevice:
remotes::install_github("lawremi/cairoDevice")

#6) gWidgets2RGtk2:
remotes::install_github('jverzani/gWidgets2RGtk2')
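
Because these packages have to be installed in a particular order, it can help to check which of them are already present before launching the GUI. The following is a minimal sketch using only base R; the package names are taken from the list above, and the variable names are illustrative.

```r
# Packages the Plate Cropper GUI depends on (names taken from the list above)
gui_pkgs <- c('remotes', 'magick', 'gWidgets2', 'RGtk2',
              'cairoDevice', 'gWidgets2RGtk2')

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(gui_pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message('Still to install: ', paste(missing_pkgs, collapse = ', '))
} else {
  message('All GUI dependencies are installed.')
}
```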

2.2 Running PIDIQv2

Now that we have cropped the image down to just the flat, let’s test-run pidiq_calc2() on it:

#Note: if left unassigned, pidiq_calc2() prints its results directly to the console, so assign them to a variable:

res_test <- pidiq_calc2(img_file = paste0(img_paths[1], '_cropped.jpg'), #Path to the cropped flat image.
                        output_dir = './test_pidiq2/', #Keeping the output directory to default.
                        gsize_cut = NULL, #Do not set for preliminary run.
                        test_plt = TRUE, #Enabling the generation of intermediate test images to see if image segmentation is ok.
                        col_lab = 1:10, #Column labelling.
                        row_lab = LETTERS[1:4], #Row labelling.
                        filter = 'arabidopsis', #Spectral filtering preset to use.
                        msg = TRUE #Enable verbose console progress-messages.
            )
  • When calling pidiq_calc2(), probably the most important parameter besides img_file is filter. It selects the preset spectral filtering h,s,v parameters used for calling healthy/diseased plant pixels, which is the core of the PIDIQ methodology. These parameters should be provided in a tab-delimited table (see ./pidiq_presets.csv from the pidiq2 github). pidiq_calc2() expects to find this file in the current working directory of the R session, and throws an error if it cannot. Keeping the presets in a file like this is probably the best way to keep track of analyses and make them reproducible.

  • Work is in progress on a GUI for setting PIDIQ spectral filtering presets. It should be available soon - CBT - 26/03/2023
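
Since a missing presets file stops the run with an error, a quick check before calling pidiq_calc2() can save a failed run. This is a minimal sketch that assumes only what is stated above: the file is named pidiq_presets.csv, is tab-delimited, and must sit in the working directory. The column layout of the table is not assumed, so the sketch only inspects it with str().

```r
# File name taken from the text above; pidiq_calc2() is described as
# looking for it in the current working directory
preset_file <- 'pidiq_presets.csv'

presets_available <- file.exists(preset_file)
if (presets_available) {
  # The table is described as tab-delimited despite the .csv extension
  presets <- read.delim(preset_file, stringsAsFactors = FALSE)
  str(presets)
} else {
  message('No ', preset_file, ' found in ', getwd(),
          '; copy it there before calling pidiq_calc2().')
}
```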

As pidiq_calc2() runs we’ll see basic progress messages printed to the console, including a warning that we have not set the segmentation group size threshold, i.e. the gsize_cut parameter.

2.2.1 Setting Segmentation Group Size Threshold

I recommend that on your first run you do not set the segmentation group size threshold, gsize_cut. This parameter is used for filtering out background noise, e.g. perlite and other oddities of green/yellow hued plastic wells which might have passed the initial stage of spectral filtering. Once PIDIQv2 has been used on more experimental setups, it might be possible to narrow down a good initial value. The only recommendation I can make is that if your photos are of individual plant leaves against a pure, non-reflective black background (not felt), then you can set gsize_cut = 0.

To help you set gsize_cut, you will get a rank-ordered plot of segmentation groups sorted by decreasing size (in # of pixels on the y-axis), from which a cutoff can be selected and provided: either by keyboard-entry via console prompt, or selection directly from the plot (after entering interactive-mode).

How would you go about interpreting this plot? Well, you will notice a vertical line which intersects the x-axis at the rank order equal to the number of plants/wells on the plate (= length(col_lab) * length(row_lab)). This line represents an ideal scenario where each plant/seedling can be assigned to an individual segmentation group, i.e. where no plants overlap and continuous borders can be drawn around filtered pixels. But this is rarely the case.

You will also see that the plot has a strong knee where segmentation group sizes rapidly drop-off, around 200-250 pixels… This would be a good place to start with selecting gsize_cut. I will be a little more liberal and choose a threshold of <= 100 pixels to avoid the possibility of excluding any plant tissue. We can always fine-tune it later, and then supply gsize_cut for consistently processing all remaining flat/plate images.
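
The reasoning above can be sketched with a few lines of base R. The group sizes below are hypothetical values chosen to show a knee, not output from pidiq_calc2(), and the rule that groups of gsize_cut pixels or fewer are discarded is an assumption based on the "<= 100 pixels" wording above.

```r
# Hypothetical segmentation group sizes in pixels (illustrative values only)
grp_sizes <- c(5200, 4800, 4500, 4100, 3900, 3500, 900, 400, 250, 220,
               90, 60, 40, 25, 12, 8, 5, 3, 2, 1)

# Rank-order the groups by decreasing size, as in the diagnostic plot
ranked <- sort(grp_sizes, decreasing = TRUE)

# The expected number of plants is the product of the label counts
col_lab <- 1:10
row_lab <- LETTERS[1:4]
n_wells <- length(col_lab) * length(row_lab)

# Assumed rule: discard groups of gsize_cut pixels or fewer
gsize_cut <- 100
kept <- ranked[ranked > gsize_cut]

n_wells       # position of the vertical reference line
length(kept)  # number of groups surviving the size filter
```

Comparing length(kept) with n_wells gives a rough sense of how far the image is from the ideal one-group-per-plant scenario.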

  • Note: It would be good to provide the number of crops/plants in the image and the average crop size in pixels, and to calculate segmentation group size as a proportion of the image/crop area of a flat.

2.3 Output

Before we get to the output files, there is additional useful information that pidiq_calc2() provides in the console:

Ideally, if we skipped the gsize_cut selection stage above, then it would only take ~16.5 seconds to process this plate of ~40 plants. Not bad!

  • Note that at Step 5) only 8 segmentation groups could be assigned to 7 distinct crop regions. Keep in mind that a ‘crop’ corresponds to an individual sub-region of the image where pidiq_calc2() expects to find a single plant. So, only 7 of the 40 crop regions could be reliably matched to their own segmentation groups.

  • In Step 6) we read that 2 Segmentation groups have been merged across 39 crops, from which we can conclude that there is extensive overlap of plants, which will likely affect the accuracy of our downstream pidiq summary stats.

2.3.1 PIDIQ Summary Stats

Keeping this in mind, we can also take a look at the results stored in the res_test data.frame:

res_test


Each row corresponds to an individual plant extracted from the flat image and its corresponding pidiq stats.

The columns, in order:

pidiq_calc2() - Format of Output and PIDIQ Summary Stats Calculated for Multi-Plant Images

| Column Name | Description |
|---|---|
| File | Input File Processed (useful if running pidiq_calc2() iteratively over multiple images) |
| Well | For seedling screens, equivalent to crop/individual plant region extracted from the image (if multi-plant) |
| Well_GreenArea | Sum of ‘green’ / healthy tissue (determined using spectral filtering presets) |
| Well_YellowedArea | Sum of ‘yellow’ / diseased tissue (ditto) |
| Well_Sum | Sum of green + yellow pixels, i.e. size of the plant extracted from the image |
| Well_Prop | Well_Sum / Total Area (in pixels) of the respective crop region - if > 1 then plant is overgrown, if < 0.1, then plant is suspiciously small. ‘Empty’ wells will be ‘NA’ |
| Well_RawYellow | Proportion of Yellowed Plant Area = Well_YellowedArea / Well_Sum |
| Well_ArcYellow | Normalized ArcSin transformed Well_RawYellow: ArcSin(Well_RawYellow) / ArcSin(1) |

  • Well_ArcYellow is what you will want for quantifying disease, however Well_Prop is very useful for identifying potential problematic wells for exclusion, and with appropriately named File and Well columns, you can combine these results with any additional metadata you want!
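
To make the column definitions concrete, here is a minimal worked sketch that recomputes the derived columns from raw pixel counts, following the formulas exactly as written in the table. The pixel counts are hypothetical, and crop_area and flag are illustrative names that are not part of the pidiq_calc2() output.

```r
# Hypothetical pixel counts for three wells (illustrative values only)
wells <- data.frame(
  Well = c('A1', 'A2', 'A3'),
  Well_GreenArea = c(500, 2400, 30),
  Well_YellowedArea = c(500, 600, 10),
  crop_area = c(2500, 2500, 2500)  # assumed name: crop region area in pixels
)

# Derived columns, following the definitions in the table
wells$Well_Sum <- wells$Well_GreenArea + wells$Well_YellowedArea
wells$Well_Prop <- wells$Well_Sum / wells$crop_area
wells$Well_RawYellow <- wells$Well_YellowedArea / wells$Well_Sum
wells$Well_ArcYellow <- asin(wells$Well_RawYellow) / asin(1)

# Flag wells for manual review using the thresholds quoted in the table
wells$flag <- ifelse(wells$Well_Prop > 1, 'overgrown',
              ifelse(wells$Well_Prop < 0.1, 'suspiciously small', 'ok'))
wells
```

Well A1 is half yellowed, so its Well_RawYellow is 0.5 and its Well_ArcYellow is asin(0.5) / asin(1) = 1/3; A2 exceeds its crop region and is flagged as overgrown, while A3 covers under 10% of its region and is flagged as suspiciously small.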

2.3.2 Modified Images

In the output directory, ./test_pidiq2/, you will also find up to 4 output images for your perusal; two are always written, and two more are added if you set test_plt = TRUE:

pidiq_calc2() - Output Images

| File Suffix | Default? | Description |
|---|---|---|
| <filename>_painted.png | Yes | Final Image with Green (= Green) and Yellowed (= Red) Pixels Highlighted |
| <filename>_seggrp_final.png | Yes | Final Segmentation Groups And Individual Crop Assignments |
| <filename>_seggrp_orig_bw.png | No (test_plt = TRUE) | Preliminary Neighbouring-Pixel Segmentation Groups |
| <filename>_seggrp_orig_watershed.png | No (test_plt = TRUE) | Preliminary Watershed Segmentation Groups |
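
As a small sketch of how these names are put together, the expected output paths can be built from the input file name and the suffixes listed above, and then checked on disk. The pattern (output directory + input file name + suffix) is inferred from the example paths shown in this tutorial; out_paths is an illustrative name.

```r
img_file <- './photos/DSC_0788.JPG_cropped.jpg'
output_dir <- './test_pidiq2/'

suffixes <- c('_painted.png', '_seggrp_final.png',
              '_seggrp_orig_bw.png', '_seggrp_orig_watershed.png')

# Output name = output directory + input file name + suffix
out_paths <- paste0(output_dir, basename(img_file), suffixes)

# Report which of the expected images are present on disk
data.frame(path = out_paths, exists = file.exists(out_paths))
```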

2.4 Exploring Results

These images give us an important sanity-check on how well pidiq_calc2() did at distinguishing the individual plants, and what actually went into calculating the final PIDIQ stats.

Preliminary Neighbouring-Pixel Segmentation Groups: ./test_pidiq2/DSC_0788.JPG_cropped.jpg_seggrp_orig_bw.png

These are the preliminary segmentation groups (indicated by colour) produced by joining neighbouring pixels which passed initial green-pass filtering. Evidently, a lot of plants overlap, and only a few distinct petals can be reliably assigned to a crop region.

Preliminary Watershed Segmentation Groups: ./test_pidiq2/DSC_0788.JPG_cropped.jpg_seggrp_orig_watershed.png

A second pass of watershed segmentation produces a greater level of granularity, well suited to identifying individual leaves of plants. As above, colours indicate distinct segmentation groups, and points indicate their midpoints. These groups are used to identify the problematic areas where plants overlap and to reassign pixels to specific crops. At the moment, if these overlapping regions cannot be resolved to a given crop region, they are excluded.

Final Segmentation Groups And Individual Crop Assignments: ./test_pidiq2/DSC_0788.JPG_cropped.jpg_seggrp_final.png

These are the final, resolved segmentation groups, individually labelled using the provided row_lab and col_lab arrays. pidiq_calc2() successfully identified an empty well in the last column of the second row (see the results data.frame above). But there are many problems and mis-assignments resulting from crowding and the irregular placement of plants, particularly in the first row (plants A5 to A10). Note that a good proportion of plant tissue is also discarded because of significant overlap.

2.5 PIDIQv2 On A Batch Of Images

So, now we’ve seen how PIDIQ works on a single (multi-plant) image. What about if we have a directory of images to quantify?

Just like with the example we’ve just seen, we’ll begin by making sure our images contain only the flat/plate region with the plants, cropping if necessary (with the PIDIQ plate cropper GUI, or any other image editor):

#Get the paths to the cropped images:
img_dir <- './photos/'
img_paths <- paste0(img_dir, list.files(img_dir, pattern = '_cropped'))

#img_paths

#Run pidiq_calc2() iteratively over img_paths, saving results for each file into a list:

start <- Sys.time()
res_test_list <- lapply(img_paths, pidiq_calc2, 
       #Supplied input parameters:
       #img_file - this is given by lapply
       output_dir = './test_pidiq2/', #Keeping the output directory to default.
       gsize_cut = 100, #Set from previous test-run.
       #test_plt = TRUE, #Enabling the generation of intermediate test images (disable to speed up running time).
       col_lab = 1:10, #Column labelling.
       row_lab = LETTERS[1:4], #Row labelling.
       filter = 'arabidopsis')

stop <- Sys.time()
#Now, merge all of the results together into a data.frame:
res_test2 <- do.call('rbind.data.frame', res_test_list)

dim(res_test2)
## [1] 320   8
#How much time did it take?
stop - start
## Time difference of 1.523462 mins
#About 0.29 seconds per plant (8 images x 40 plants each = 320 plants):
round(as.numeric(difftime(stop, start, units = 'secs')) / (40 * 8), 2)
## [1] 0.29
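
Once the batch results are in a single data.frame, the File and Well columns can be used to attach experimental metadata. The sketch below uses small stand-in tables with hypothetical values; plate_map and its Treatment column are illustrative names, and the exact contents of File in real output may differ (for example, a full path rather than a file name).

```r
# Stand-in for a slice of the merged results (hypothetical values)
res_demo <- data.frame(
  File = rep('DSC_0788.JPG_cropped.jpg', 3),
  Well = c('A1', 'A2', 'B1'),
  Well_ArcYellow = c(0.10, 0.35, 0.22)
)

# Hypothetical plate map describing what was sown in each well
plate_map <- data.frame(
  File = rep('DSC_0788.JPG_cropped.jpg', 3),
  Well = c('A1', 'A2', 'B1'),
  Treatment = c('mock', 'pathogen', 'pathogen')
)

# Join on the shared File and Well columns
res_annot <- merge(res_demo, plate_map, by = c('File', 'Well'))

# Mean disease score per treatment
trt_means <- aggregate(Well_ArcYellow ~ Treatment, data = res_annot, FUN = mean)
trt_means
```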

3 Conclusions

3.1 Recommendations

It is important to note that the PIDIQv2 algorithm was originally developed for 48-well seedling assays. This dataset highlights some important considerations to make if you decide to apply pidiq_calc2() to sprayed flats of fully-grown plants.

Are You Thinking Of Using pidiq_calc2() On Multi-Plant Images?

  1. Are your plants regularly spaced, following an orderly grid-layout on the flat?
  2. Are your plants overgrown/overlapping?

As we could see from the example flat image above, irregularly spaced plants can result in mis-assignment of image segmentation groups, and overlapping plants might result in exclusion of ambiguously located segmentation groups. In the end both factors will cause inaccurate calculation of PIDIQ summary stats and your quantified experimental data will be unreliable, if not unusable for analysis.

If you find that overgrown plants are posing a significant problem and the results of segmentation are undesirable, then you could alternatively crop individual plants out of your image (I have a helper function to automate this process if you need it) and iteratively run pidiq_calc2() on them in single-image mode (row_lab = NULL and col_lab = NULL). This was basically the approach Alex used when applying the older PIDIQ ImageJ macro (without segmentation) to quantify 48-well A. thaliana seedling plates.
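
To illustrate the idea of cutting a flat into per-plant crops, here is a minimal sketch that computes one crop box per plant for an evenly spaced grid. It is not the helper function mentioned above: grid_crop_boxes() is a hypothetical name, the image dimensions are made up, and the sketch assumes the image has already been cropped to the flat and that plants sit on a regular grid.

```r
# Compute one crop box per plant for an evenly spaced grid (hypothetical helper)
grid_crop_boxes <- function(img_width, img_height, row_lab, col_lab) {
  cell_w <- as.integer(img_width %/% length(col_lab))
  cell_h <- as.integer(img_height %/% length(row_lab))
  # Columns vary fastest, so wells are ordered A1, A2, ..., then B1, ...
  grid <- expand.grid(col = seq_along(col_lab), row = seq_along(row_lab))
  data.frame(
    Well = paste0(row_lab[grid$row], col_lab[grid$col]),
    x = (grid$col - 1L) * cell_w,   # left edge of the crop, in pixels
    y = (grid$row - 1L) * cell_h,   # top edge of the crop, in pixels
    width = cell_w,
    height = cell_h
  )
}

# Example: a 4000 x 1600 pixel flat with 4 rows and 10 columns
boxes <- grid_crop_boxes(4000, 1600, LETTERS[1:4], 1:10)
head(boxes, 3)

# Geometry strings in the 'WIDTHxHEIGHT+X+Y' form used by ImageMagick
geometry <- sprintf('%dx%d+%d+%d', boxes$width, boxes$height, boxes$x, boxes$y)
geometry[1]
```

Each geometry string could then be passed to a cropping tool such as magick::image_crop(), with the resulting single-plant images processed one at a time.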

3.2 Future Developments

While reflecting on this analysis, I have come up with some additional features for PIDIQv2 that will give users a greatly improved multi-plant analysis experience:

  • Optimized Image Segmentation Tools: applying visualization of grids during image cropping, and allowing users to perform custom plant sub-region selection for irregularly planted flats.
  • Improved Reproducibility: Using log-files to track input parameters used for each analysis run, and flagging problematic images based on irregularity, overlap, and loss of segmentation groups.
  • Improved User-Interactivity: an interactive GUI interface to pidiq_calc2() that will allow users to supervise the algorithm at each stage of analysis and to finalize segmentation group assignments when plants overlap.

Some of these problems could also be corrected (but not entirely obviated) by incorporating a deep-learning model for performing image segmentation, although resolving overlapping plants is still a significant challenge in this field. Another disadvantage of this approach is that it requires training data of pre-segmented images covering the relevant plant hosts and stages of development, so it can’t be used out of the box. It might be possible to overcome this limitation through the use of generative autocorrelation models to create simulated datasets.

FIN: Take care when planning your experiments and be consistent in planting your flats; it will save you a lot of time and frustration at the end of the line! Have fun!